ESTIMATING ITEM CHARACTERISTIC CURVES


I.  INTRODUCTION 

Increased  interest  in  computer-driven  adaptive  testing,  automated  item  banking,  and  automated  test 
construction  has  made  the  estimation  of  the  Item  Characteristic  Curve  (ICC)  important.  This  curve 
describes  the  relationship  between  the  ability  of  individuals  and  the  probability  of  their  answering  a test 
question  correctly.  It  is  useful  in  estimating  test  scores,  equating  the  scores  of  various  tests,  and  scoring 
responses  during  adaptive  testing.  There  are  several  methods  for  estimating  ICC  within  available  computer 
programs.  Selection  and  implementation  of  the  appropriate  program  becomes  a task  for  the  practitioner. 
The  objective  of  this  study  is  to  compare  the  merits  of  four  available  computer  programs. 

The  Research  Problem 

In order to estimate an ICC, a conceptual model must be defined and item parameters must be
estimated. The three-parameter logistic model of Birnbaum (Lord & Novick, 1968) is the most frequently
used for relating item responses to subjects’ ability. The three parameters, a, b, and c, are item
discrimination, item difficulty (or location), and probability of chance success (or lower asymptote),
respectively.

The curve described by these parameters takes the shape of a (cumulative frequency) ogive, or an “S,”
with the upper asymptote approaching a probability of 1.0 and, usually, a lower asymptote
greater than 0.0. The ogive describes the probability of obtaining a correct answer to an item as a
monotonically increasing function of ability.

The item discrimination parameter (a) is a function of the slope of the ICC and generally ranges from
.5 to about 2.5. A value of a equal to about 1.0 is typical of many test items, while a values below .5 are
insufficiently discriminating for most testing purposes, and a values above 2.0 are infrequently found.

The item difficulty parameter (b) describes the point of inflection of the ICC and is usually scaled
between -2.5 and +2.5, although the metric is arbitrary.

The item guessing parameter (c) is the lower asymptote of the ICC and is generally conceived to be
the probability of selecting the correct item-option by chance alone. Most test items have c parameters
greater than 0.0 and less than or equal to .30.

Figure 1 shows three ICCs. The horizontal axis is scaled in units of ability (θ), and the vertical axis is
the probability of answering the item correctly. The solid curved line shows an ICC for an item of average
difficulty with acceptable discrimination and the lower asymptote appropriate for a five-option
multiple-choice item. The dashed line shows an item of identical difficulty and a c value of .28, but with a lower
a value. Note how the slope of the curve is less steep. The third curve, the dot-dash line, shows an item with a c
value of .30, an a parameter of 1.0, and a b parameter equal to 1.0. As the b parameter changes, the
location of the inflection point of the curve is displaced along the horizontal axis.

In most cases the test constructor is faced with the task of estimating three parameters for each of the n items
and one ability parameter (θ) for each of the N examinees, so that N + 3n parameters must be estimated for each
group of test items. For a group of 2,000 examinees taking 80 items, 2,240 [2,000 + (3 x 80)] parameters
must be estimated simultaneously. In an iterative procedure, this estimation must be repeated several times,
which leads to long computer runs with more precise estimates. Three of the four ICC estimation
procedures evaluated in this study are iterative. The fourth is a monotonic increasing function of the
biserial correlation between the item and raw score.

Figure 1. Item characteristic curves.


II.  METHOD 

A simulation was run in order to have known values for ability level (θ) and for the item parameters.
Three distributions of ability (θ) with differing shapes were generated on which to test the procedures for
ICC parameter estimation. Each θ is equivalent to a “subject.” The generated item parameters (a, b, c)
remained constant across the three distributions of ability (θ).

Four methods of assessing the adequacy of the ICC estimation procedures were used. First, the
estimated item parameters (â, b̂, ĉ) were correlated with the known item parameters. Second, the θ̂
estimated by using â, b̂, and ĉ from each estimation procedure was correlated with the known θ. Third,
“true scores” and estimated “true scores” from the â, b̂, and ĉ were compared (Lord, 1975). Finally, the
test information curve was compared with estimates of the test information curve using the item parameters
estimated in the three data sets. Table 1 shows the means, standard deviations, and minimum and maximum
θ for the three data sets.


Table 1. Descriptive Statistics for the Distribution of θ for the Three Data Sets

Data Set    Mean    Standard Deviation    Minimum    Maximum    Skew    Kurtosis


Data  Set  1 (DS1) 

The distribution of θ for DS1 was generated by dividing the interval between -2.5 and +2.5 into
2,000 equal intervals and assigning each resultant number as a value of θ. This data set is similar to those
sometimes produced for item analytic studies for tests such as the Armed Services Vocational Aptitude
Battery (Jensen, Massey, & Valentine, 1976).

Data  Set  2 (DS2) 

The distribution of θ for DS2 was generated by obtaining 3,000 cases from a unit normal random
number generator. Two thousand values for θ were selected by administering a “test” and generating a sum
of the number-right scores for the 3,000 based on ICC parameters of a 30-item subtest used in military
selection and classification. A cutting score was set which would yield the upper two-thirds of the
population. This method, rather than just cutting at a θ = 33.3 percentile equivalent, was used to emulate
actual selection practices which involve errors of measurement. The resultant distribution does not have a
sharp truncation of θ but is asymmetric with few scores below a specified level. DS2 is similar to samples
frequently available to organizations which must work with samples selected for inclusion in training or
education.


Data  Set  3 (DS3) 

The distribution of θ for DS3 was generated by accessing the unit normal random number generator
for 2,000 numbers.
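
The three ability distributions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the generator used in the study: DS1 takes the midpoint of each of the 2,000 equal intervals, the 30 selection-test items in `make_ds2` are hypothetical stand-ins for the military subtest parameters (which are not listed in this report), and the cutting score is approximated by keeping the 2,000 highest simulated number-right scores. All function names are illustrative.

```python
import math
import random


def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))


def make_ds1(n=2000, lo=-2.5, hi=2.5):
    """Rectangular abilities: midpoint of each of n equal intervals (assumption)."""
    width = (hi - lo) / n
    return [lo + (k + 0.5) * width for k in range(n)]


def make_ds3(n=2000, seed=3):
    """Unit normal abilities."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def make_ds2(n_pool=3000, n_keep=2000, n_items=30, seed=2):
    """Selected abilities: keep the examinees with the highest simulated scores."""
    rng = random.Random(seed)
    # Hypothetical selection-test items; the real subtest parameters are not given.
    items = [(rng.uniform(0.6, 1.4), rng.uniform(-1.5, 1.5), 0.20)
             for _ in range(n_items)]
    scored = []
    for _ in range(n_pool):
        theta = rng.gauss(0.0, 1.0)
        # Number-right score on the selection test, so selection includes error.
        score = sum(rng.random() <= p_3pl(theta, a, b, c) for a, b, c in items)
        scored.append((score, theta))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [theta for _, theta in scored[:n_keep]]
```

Because selection in `make_ds2` is made on a fallible score rather than on θ itself, the retained θ values have a raised mean and a thin lower tail instead of a sharp cut.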

ICC  Parameters 

The  distributions  of  ICC  parameters  were  generated  to  simulate  80  five-option  multiple-choice  test 
questions.  A normal  distribution  was  specified  for  each  ICC  parameter.  The  means  and  standard  deviations 
of  these  distributions  were  set  to  produce  item  parameters  similar  to  those  likely  to  be  obtained  in  actual 
practice.  Table  2 describes  these  distributions. 


Table 2. Descriptive Statistics of the Generated ICC Parameters

ICC Parameter    Mean     Standard Deviation    Minimum     Maximum
a                .9504    .2837                 .4647       1.6136
b                .1635    .9286                 -1.6530     1.9745
c                .2009    .0458                 .0872       .3479

Note. These ICC parameters were used for all three data sets.
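
A minimal sketch of drawing 80 items from normal distributions follows. The means and standard deviations are the realized values reported in Table 2, used here as stand-ins because the target values specified to the generator are not stated. Redrawing values that fall outside the admissible range (a > 0, 0 < c < 1) is an assumption of this sketch, as are the function names.

```python
import random

# (mean, standard deviation) taken from Table 2 and used as target values.
TARGETS = {"a": (0.9504, 0.2837), "b": (0.1635, 0.9286), "c": (0.2009, 0.0458)}


def draw_within(rng, mean, sd, lo, hi):
    """Draw from a normal distribution, redrawing values outside (lo, hi)."""
    while True:
        value = rng.gauss(mean, sd)
        if lo < value < hi:
            return value


def generate_items(n_items=80, seed=1):
    """Return a list of (a, b, c) tuples for n_items simulated items."""
    rng = random.Random(seed)
    items = []
    for _ in range(n_items):
        a = draw_within(rng, *TARGETS["a"], 0.0, float("inf"))  # keep a positive
        b = rng.gauss(*TARGETS["b"])                            # b is unrestricted
        c = draw_within(rng, *TARGETS["c"], 0.0, 1.0)           # c is a probability
        items.append((a, b, c))
    return items
```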


Generation  of  Item  Responses 

In order to generate a vector of item responses for each “subject,” the θ values were used in equation
(1) to compute the likelihood of “passing” each item. The three-parameter logistic model is given by:

\[ P_i(\theta_j) = c_i + (1 - c_i)\left[1 + e^{-1.7 a_i (\theta_j - b_i)}\right]^{-1} \tag{1} \]

where P_i(θ_j) is the probability of “subject” j answering test item i correctly and a_i, b_i, and c_i are the item
parameters for item i.

Because equation (1) yields a number P_i(θ_j) such that 0.0 < P_i(θ_j) < 1.0, a number, X_j, is drawn from
a uniform (rectangular) distribution ranging from 0.0 to 1.0 and compared to P_i(θ_j). If X_j is larger than
P_i(θ_j), then an incorrect response is specified for the item; otherwise, a correct response is specified for the
item. Thus, a “subject” with P_i(θ_j) = .90 gets the item correct 9 in 10 times, and a vector of item responses
is developed for each “subject” in each data set. These response vectors are then used to estimate a, b, and c
by the four methods.
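
The response-generation rule can be illustrated with a short Python sketch. The function names and the seeding scheme are illustrative assumptions; only the comparison of a uniform draw with the probability from equation (1) comes from the text.

```python
import math
import random


def p_3pl(theta, a, b, c):
    """Equation (1): probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))


def simulate_responses(thetas, items, seed=0):
    """Return one 0/1 response vector per subject.

    thetas: list of abilities; items: list of (a, b, c) tuples.
    """
    rng = random.Random(seed)
    responses = []
    for theta in thetas:
        row = []
        for a, b, c in items:
            x = rng.random()  # uniform draw on [0, 1)
            # Incorrect if the draw exceeds the probability, otherwise correct.
            row.append(0 if x > p_3pl(theta, a, b, c) else 1)
        responses.append(row)
    return responses
```

For example, with a = 1, b = 0, and c = 0, a subject at θ = ln(9)/1.7 has a probability of .90, so about nine in ten simulated responses to such an item come out correct.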

Estimation of ICC Parameters


The following four methods of ICC estimation were selected because of their wide availability to
practitioners: ANCILLES, LOGIST, OGIVIA, and transformations to the item-test biserial correlation. All
are three-parameter models.

ANCILLES and OGIVIA (developed by U. S. Civil Service Commission) are described by Urry (1977,
1978) and LOGIST (developed by Educational Testing Service) is described by Wood, Wingersky, and Lord
(1976). The transformations may be found in Lord and Novick (1968). These procedures were
implemented on a UNIVAC 1108 and thoroughly checked out by processing the sample data set supplied
by each of the authors of the programs. Default options for the programs were specified where possible,
and the logistic model was used throughout.
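
The exact transformation program used in the study is not reproduced in this report. As a hedged illustration, the sketch below applies the classical normal-ogive relations associated with Lord and Novick (1968), a = r / sqrt(1 - r²) and b = γ / r, where r is the item biserial correlation and γ is the unit normal deviate above which the proportion passing the item lies. These relations assume normally distributed ability and no guessing. Fixing c at .20, the function name, and the absence of any guessing correction to the proportion correct are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def transform_item(p_correct, r_biserial, c_fixed=0.20):
    """Approximate (a, b, c) from classical item statistics.

    p_correct: proportion of examinees answering the item correctly.
    r_biserial: biserial correlation between the item and the criterion score.
    c_fixed: constant lower asymptote (assumed, not estimated).
    """
    # Normal deviate above which a proportion p_correct of a unit normal lies.
    gamma = -NormalDist().inv_cdf(p_correct)
    a = r_biserial / math.sqrt(1.0 - r_biserial ** 2)
    b = gamma / r_biserial
    return a, b, c_fixed
```

Because nothing is iterated, the whole calibration reduces to a few arithmetic statements per item, which matches the low computing cost reported for the transformation procedure.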


III.  RESULTS 


The first set of analyses consisted of correlating the ICC parameters with the estimated ICC
parameters (â, b̂, ĉ). Table 3 shows these results for each data set.


Table 3. Correlations of ICC and Estimated ICC Parameters

Columns: Data Set, followed by r(a, â), r(b, b̂), and r(c, ĉ) for each of ANCILLES, LOGIST,
OGIVIA, and Transformation.

1

.873 

.895 

.978 

.557 

.868 

.965 

.362 

.592 

.963 

* 

2** 

.440 

.941 

.565 

.447 

.233 

.556 

.923 

.323 

.917 

* 

3 

.836 

.968 

.325 

.827 

.975 

.379 

.837 

.976 

.225 

.349 

.965 

* 

* Constant value of c = .20 precludes calculation of correlation.

** Entries for ANCILLES and OGIVIA based on 75 and 64 items, respectively.


The second set of analyses was of the correlation of θ and θ̂, with θ̂ computed using a maximum likelihood
method and the various estimates of a, b, and c from the four procedures. These correlations were analyzed
to determine how accurately θ could be estimated from â, b̂, and ĉ, as would be done in adaptive testing.

Maximum Likelihood Estimation (MLE) of θ is computed using the likelihood function defined as:

\[ L(\theta) = \prod_{i=1}^{n} P_i(\theta)^{u_i} \, Q_i(\theta)^{1 - u_i} \tag{2} \]

where Q_i(θ) = 1 - P_i(θ) and u_i is 1 if item i was answered correctly and 0 if answered otherwise. The
maximum of the distribution of likelihoods is found by the method derived by Jensema (1974). The use of
this procedure is advantageous because it allows the estimation of θ regardless of the sequence of item
administration. Other methods, such as Bayesian estimation of θ, are sequence dependent (see Sympson,
1976).

MLE is not sequence dependent but has the problems of possible failure to converge or of reaching an
asymptotically infinite estimate. Both of these problems can be rectified by arbitrarily placing a limit on
the number of iterations and by placing an upper and lower limit on θ̂. Maximum Likelihood Estimates of
θ were computed using the response vectors generated from equation (1), each set of estimated item
parameters, and the generated item parameters. The estimation of θ using the generated (a, b, c) item
parameters indicates the bias involved in the estimation of θ alone. The correlation of θ and the resultant
θ̂ is a measure of test reliability. No correlation of θ and θ̂ using any of the estimated â, b̂, or ĉ parameters
should be expected to exceed the correlation of θ and θ̂ using the generated a, b, c. Table 4 shows the
results of these analyses. The column headed Population is the analysis using the generated item parameters.
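
The study located the maximum of the likelihood with Jensema's (1974) method, which is not reproduced in this report. As a stand-in, the sketch below evaluates the logarithm of equation (2) on an evenly spaced grid and returns the grid point with the largest value. The grid search, the step count, and the function names are assumptions of this sketch. The bounds of -2.5 and +2.5 follow the limits reported for this study.

```python
import math


def p_3pl(theta, a, b, c):
    """Equation (1): probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))


def log_likelihood(theta, items, responses):
    """Logarithm of equation (2) for one response vector."""
    total = 0.0
    for (a, b, c), u in zip(items, responses):
        p = p_3pl(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def mle_theta(items, responses, lo=-2.5, hi=2.5, steps=500):
    """Bounded grid-search estimate of theta (a stand-in for Jensema's method)."""
    best_theta, best_value = lo, -math.inf
    for k in range(steps + 1):
        theta = lo + (hi - lo) * k / steps
        value = log_likelihood(theta, items, responses)
        if value > best_value:
            best_theta, best_value = theta, value
    return best_theta
```

A response vector that is all correct or all incorrect has no finite maximum, so the estimate lands on the upper or lower bound. This is the piling-up at the limits that bounded MLE produces for extreme cases.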

Table 4. Descriptive Statistics for the Estimates of θ Computed
from the Generated and Estimated Item Parameters
(N = 2,000)

                                           Estimation Method
Statistic            x(a)      Population  ANCILLES  LOGIST    OGIVIA    Transformation

Rectangular Data Set (μθ = -.00125, σθ = 1.4437)
Number of Items      80        80          80        80        80        80
Mean                 46.257    .0181       .0147     -.0133    .1004     .0412
Standard Deviation   19.629    1.4695      .9223     1.0163    .9087     .9038
r(θ, θ̂)              .977      .980        .970      .974      .974      .955
Mean difference                .0194       .0125     .0121     .1016     -.0400

Skewed and Selected Data Set (μθ = .49574, σθ = .69989)
Number of Items      80        80          75        80        64        80
Mean                 52.565    .5028       -.0167    .0316     -.4219    .0199
Standard Deviation   11.313    .7483       1.0147    1.0263    .9174     .9747
r(θ, θ̂)              .939      .948        .935      .943      .937      .930
Mean difference                -.0071      -.5123    -.4641    -.9176    -.4758

Normal Data Set (μθ = -.01269, σθ = 1.0191)
Number of Items      80        80          80        80        80        80
Mean                 45.587    .0096       .0078     -.0073    .0706     -.0038
Standard Deviation   14.615    1.0362      1.0020    1.0147    .9899     1.2313
r(θ, θ̂)              .957      .966        .964      .965      .965      .961
Mean difference                .0223       .0204     .0053     .0833     .0088

(a) x indicates the number-right score, and all descriptive statistics in this column refer to the
number-right score. The correlation is between θ and the number-right score. The mean difference
is the average difference between θ and θ̂.

The third set of analyses follows guidance proposed by Lord (1975) to eliminate most of the
problems associated with estimating extreme values of θ. These are termed true score (ξ) analyses. Because
MLE procedures tend to exhibit bias on extreme cases, there may be a piling-up of estimates at the
minimum and maximum values allowed by the particular estimation routine. There are no empirical rules
for setting either minimum or maximum values to be obtained in the MLE process. The limits set depend
on judgment. In this study, the values were set at -2.50 and +2.50. Other values might have yielded slightly
different values in Table 4. Estimation of true scores avoids these problems. Equation 3 defines true score.

\[ \xi_j = \sum_{i=1}^{n} P_i(\theta_j) \tag{3} \]

where ξ_j is the true score, n is the number of items, and P_i(θ_j) is the probability of a correct response for
the item as in equation (1). Similarly, the estimated true score is given by

\[ \hat{\xi}_j = \sum_{i=1}^{n} \hat{P}_i(\theta_j) \tag{4} \]

where P̂_i(θ_j) is computed from equation (1) using â, b̂, and ĉ.
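
Equations (3) and (4) differ only in which item parameters are supplied, which a short Python sketch makes concrete. The function names are illustrative. Passing the generated parameters gives ξ and passing a program's estimates gives ξ̂.

```python
import math


def p_3pl(theta, a, b, c):
    """Equation (1): probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))


def true_score(theta, items):
    """Equations (3)/(4): sum of the item probabilities at a given theta.

    items: list of (a, b, c) tuples, either generated or estimated.
    """
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)
```

Because each probability is bounded between c and 1, the true score is bounded between the sum of the c values and n. This is why it avoids the unbounded estimates that trouble θ̂.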

Table 5 shows the means and standard deviations of ξ and ξ̂, the average difference between them,
and their intercorrelation.


Table 5. Descriptive Statistics of ξ and ξ̂, the Average Difference
Between Them, and Their Correlation

Procedure          r(ξ, ξ̂)    Mean ξ̂     SD ξ̂       Mean difference

Data Set 1 (μξ = 46.099, σξ = 19.245)
ANCILLES           .9927      46.444     24.927     .3444
LOGIST             .9960      47.205     23.424     1.1059
OGIVIA             .9945      45.210     25.091     -.8895
Transformation     .9910      47.589     24.352     1.4894

Data Set 2 (μξ = 52.49, σξ = 10.592)
ANCILLES(a)        .9995      54.63      7.783      5.3617
LOGIST             .9997      58.02      7.260      5.531
OGIVIA(b)          .9994      45.52      7.7895     -.4415
Transformation     .9999      58.04      8.028      5.550

Data Set 3 (μξ = 45.326, σξ = 14.204)
ANCILLES           .9998      45.90      14.325     .5737
LOGIST             .9999      46.085     14.112     .7591
OGIVIA             .9999      45.158     14.044     -.1680
Transformation     .9999      45.950     14.157     .6236

(a) 75 items only for ξ and ξ̂.
(b) 64 items only for ξ and ξ̂.


The fourth set of analyses consisted of comparisons of the test information curve using the known a,
b, c versus test information computed from â, b̂, ĉ from the four item-parameter estimation techniques.

Item information is defined as

\[ I_g(\theta) = \frac{\left[\partial P_g(\theta) / \partial \theta\right]^2}{P_g(\theta)\,\left[1 - P_g(\theta)\right]} \tag{5} \]

where P_g(θ) is estimated from equation (1) and the numerator is the squared first derivative (i.e., the
squared slope) of P_g(θ) at a fixed value of θ. Test information is the sum of the item information curves
making up a test and is defined as

\[ I(\theta) = \sum_{g=1}^{n} I_g(\theta) \tag{6} \]

where I_g(θ) is defined in equation (5). Estimates of item information (Î) may be computed by substituting
â, b̂, ĉ into equation (1) and substituting that quantity into equations (5) and (6).

It is useful to calculate item and test information curves in order to determine the precision of
measurement of a test or an item. The height of the item or test information curve at any level of θ may be
thought of as being an ICC analog to classical measures of reliability. The higher the information curve, the
higher the information value and the higher the reliability of the item or test at that level of θ.
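
For the logistic model of equation (1), the slope in equation (5) has a closed form, dP/dθ = 1.7 a (P - c)(1 - P) / (1 - c), which follows from differentiating equation (1). The sketch below uses that identity. The function names are illustrative and not taken from any of the four programs.

```python
import math

D = 1.7  # scaling constant from equation (1)


def p_3pl(theta, a, b, c):
    """Equation (1): probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Equation (5): squared slope of the ICC divided by P(1 - P)."""
    p = p_3pl(theta, a, b, c)
    slope = D * a * (p - c) * (1.0 - p) / (1.0 - c)  # dP/dtheta
    return slope ** 2 / (p * (1.0 - p))


def total_information(theta, items):
    """Equation (6): sum of item information over (a, b, c) tuples."""
    return sum(item_information(theta, a, b, c) for a, b, c in items)
```

With a = 1, b = 0, and c = 0 the item information at θ = 0 is 1.7² x .25 = .7225. Raising c lowers that value, which illustrates how guessing reduces the precision an item contributes.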

Test information curves are frequently used to compare test characteristics (Brown & Weiss, 1977;
McBride & Weiss, 1976; Vale & Weiss, 1977; and Weiss, 1975) and to select items for administration during
adaptive testing (Jensema, 1974; Ree, 1977). Because test and item information curves are computed using
ICC parameters, errors of estimation of the parameters can cause errors in the test and item
information curves.

Figures 2, 3, and 4 show the test information curve and estimates of the test information curve based
on â, b̂, ĉ estimated by the four methods in each of the data sets. The item parameters have been made
comparable by placing them on a common metric via a linear transformation of a and b. No such
transformation of c is necessary. Table 6 presents the sum of squared deviations of true test information
minus estimated test information as well as the point on θ where information reaches its maximum (θ_max),
the correlation of I and Î, and minimum and maximum values of I computed by each method in each of the
data sets.
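
The summary statistics described for Table 6 can be sketched as follows, given a true and an estimated information curve evaluated at the same θ values. The grid itself is an assumption: the ratio of the Total to the Mean columns in Table 6 suggests 51 evaluation points, which would be consistent with steps of .1 from -2.5 to +2.5, but the report does not state this. The function and key names are illustrative.

```python
import math


def information_summary(thetas, true_info, est_info):
    """Summarize an estimated information curve against the true curve.

    thetas, true_info, est_info: equal-length lists evaluated on one grid.
    """
    n = len(thetas)
    mean_t = sum(true_info) / n
    mean_e = sum(est_info) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_info, est_info))
    var_t = sum((t - mean_t) ** 2 for t in true_info)
    var_e = sum((e - mean_e) ** 2 for e in est_info)
    peak = max(range(n), key=lambda k: est_info[k])  # index of the maximum
    return {
        "total": sum(est_info),
        "mean": mean_e,
        "minimum": min(est_info),
        "maximum": max(est_info),
        "theta_max": thetas[peak],
        "sum_sq_dev": sum((t - e) ** 2 for t, e in zip(true_info, est_info)),
        "r": cov / math.sqrt(var_t * var_e),
    }
```

The correlation and the sum of squared deviations answer different questions. An estimated curve can have the right shape (high r) and still sit well above or below the true curve (large sum of squared deviations).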

[Figures 2, 3, and 4: Test Information Curves. Plotted curves include LOGIST and OGIVIA.]

Table 6. Information Analysis and Estimated Information Analyses
Based on ICC Parameters

Procedure          Total     Mean     Minimum  Maximum  θ_max    Mean difference  Σ(I - Î)²   r(I, Î)

Test Information   717.83    14.075   1.695    22.924   .800

Estimated Test Information Based on ICC Parameters from DS1
ANCILLES           736.31    14.437   2.757    23.708   .100     .362             650.04      .864
LOGIST             775.76    15.211   1.989    24.314   .600     1.136            137.96      .986
OGIVIA             735.75    14.426   3.658    22.607   .000     .351             621.29      .850
Transformation     930.77    18.250   1.477    37.708   .800     -4.175           2510.61     .971

Estimated Test Information Based on ICC Parameters from DS2
ANCILLES(a)        871.24    17.083   5.016    29.954   .900     3.008            694.15      .970
LOGIST             835.54    16.383   .954     28.096   -.600    -2.308           989.93      .958
OGIVIA(b)          613.24    12.024   .338     21.682   .900     2.051            854.63      .899
Transformation     806.66    15.817   2.360    34.105   1.00     -1.742           2361.61     .821

Estimated Test Information Based on ICC Parameters from DS3
ANCILLES           777.81    15.251   4.280    23.906   .400     -1.1760          174.48      .976
LOGIST             762.35    14.948   1.539    25.300   .700     .873             102.67      .994
OGIVIA             812.85    15.938   2.332    24.022   .800     -1.863           219.61      .991
Transformation     1070.70   20.993   1.046    47.565   1.00     -6.918           6416.10     .961

(a) 75 items only.
(b) 64 items only.


IV.  DISCUSSION 

The results clearly indicate that no one program functions best in all situations posed by the three
data sets. The transformation procedure performed poorly in most instances and is not recommended
unless no other procedures are available.

In the rectangular data set (DS1), LOGIST produces results superior to the other procedures except
in terms of the average differences between ξ and ξ̂. The correlations of estimated item parameters and
generated item parameters, θ and θ̂, I and Î, and ξ and ξ̂ are higher for LOGIST than for any other
procedure. LOGIST estimated item parameters also most nearly reproduce the test information curve.

The results from the skewed and selected data set, DS2, call attention to a peculiarity exhibited by
ANCILLES and OGIVIA. Under specific conditions, these two programs will not estimate parameters of
some items. While this may seem a disadvantage, notice that the average difference between ξ and ξ̂ for
OGIVIA is the smallest in DS2. Note also that OGIVIA shows (Table 4) an r(θ, θ̂) of .937 for 64 items
compared to .943 for 80 items using LOGIST. This increase of .006 is very small for the addition of 16 items.
LOGIST estimates item parameters for all the items, but inspection of the scatter plot of b versus b̂ indicates
several outliers which have the effect of substantially reducing the value of r(b, b̂). All the estimated test
information curves computed from DS2 estimates of the item parameters approximate the true test
information curve very poorly.

The OGIVIA procedure is the most preferable for use in the normally distributed data set, DS3. The
correlations of OGIVIA estimated â and b̂ with a and b are higher than for the other procedures; however,
its correlation of c and ĉ is less than that of either ANCILLES or LOGIST. The r(θ, θ̂) using OGIVIA is as
high as LOGIST and higher than all others. The r(ξ, ξ̂) for OGIVIA is the highest, and OGIVIA simultaneously
has the smallest average difference between ξ and ξ̂. OGIVIA is built around assumptions of the normality of
the distribution of θ and performs very well when these conditions hold true, as in DS3, or approximately hold
true, as in DS2. LOGIST estimates of the item parameters produce the highest correlation between I and Î
and the lowest sum of squared deviations of I minus Î and thus the best estimate of test information.

The decision as to which procedure to use must be based on a series of criteria. If all the items must
be calibrated, then OGIVIA and ANCILLES may present problems in a situation like that represented by
DS2. If wide range samples like DS1 and DS3 are available and the estimation of θ is the goal, then
calibration with LOGIST or OGIVIA is recommended. Clearly, if the examinees are available, a normal
distribution of θ leads to the best estimations of a, b, c, ξ, θ, and I and is desirable. These data should then be
calibrated using OGIVIA.

A final factor should be considered: cost. The transformation procedure was the quickest because,
unlike the others, it is not iterative and its work can be accomplished in about 10 FORTRAN statements.
The LOGIST procedure takes the longest on the computer. It ran eight times longer than either ANCILLES
or OGIVIA. Central Processor Unit (CPU) times on a UNIVAC 1108 with 262K words of memory for DS3
were for ANCILLES, 296 seconds; LOGIST, 2,061 seconds; OGIVIA, 180 seconds; and transformations, 38
seconds.

The  choice  of  ICC  parameter  estimation  techniques  should  be  consistent  with  the  planned  use  of  the 
estimates,  the  characteristics  of  the  distribution  of  ability  in  the  groups  available  for  item  administration, 
the  necessity  to  calibrate  all  items,  and  the  computer  resources  available. 